A/B Test on Sportsbook UI Redesign¶

Scenario:¶

The product team is experimenting with a new simplified sportsbook user interface. The hypothesis is that this cleaner layout might lead to better engagement, particularly among newly registered users.

This analysis focuses on comparing two variants:

  • classic_ui: The current default interface
  • minimal_ui: The proposed simplified layout

The dataset tracks new users and includes:

  • Number of sports bets placed during the first 14 days of usage
  • Whether the user returned on Day 1 and Day 7 (binary retention)

Objectives:¶

  • Determine whether theres a statistically significant difference in engagement between the UI versions
  • Evaluate short-term retention and its potential interaction with the UI design
  • Apply rigorous A/B testing methodology following statistical best practices

We use this test to guide product decisions backed by evidence rather than assumptions.

In [48]:
import sys
import subprocess
def install_package(package):
    process = subprocess.Popen([sys.executable, '-m', 'pip', 'install', package],
                               stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE)
    stdout, stderr = process.communicate()
install_package('statsmodels')
In [49]:
# import Required Libraries
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
import itertools
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import warnings
import statsmodels.stats.api as sms
from scipy.stats import ttest_1samp, shapiro, levene, ttest_ind, mannwhitneyu, \
    pearsonr, spearmanr, kendalltau, f_oneway, kruskal, normaltest
from statsmodels.stats.proportion import proportions_ztest
if not sys.warnoptions:
    warnings.simplefilter("ignore")
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
# Adjusting Row Column Settings
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 10)
pd.set_option('display.float_format', lambda x: '%.5f' % x)
In [50]:
df = pd.read_csv("C:/Users/Memre/Downloads/sportsbook.csv")
In [51]:
def check_df(dataframe, head=5):
    print("SHAPE".center(70,"-"))
    print(dataframe.shape)
    print("INFO".center(70,"-"))
    print(dataframe.info())
    print("NUNIQUE".center(70,"-"))
    print(dataframe.nunique())
    print("MISSING VALUES".center(70,"-"))
    print(dataframe.isnull().sum())
    print("DUPLICATED VALUES".center(70,"-"))
    print(dataframe.duplicated().sum())
check_df(df)
--------------------------------SHAPE---------------------------------
(90189, 5)
---------------------------------INFO---------------------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 90189 entries, 0 to 90188
Data columns (total 5 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   userid                  90189 non-null  int64 
 1   sports_ui_group         90189 non-null  object
 2   sports_bets_placed_14d  90189 non-null  int64 
 3   sports_login_day1       90189 non-null  bool  
 4   sports_login_day7       90189 non-null  bool  
dtypes: bool(2), int64(2), object(1)
memory usage: 2.2+ MB
None
-------------------------------NUNIQUE--------------------------------
userid                    90189
sports_ui_group               2
sports_bets_placed_14d      942
sports_login_day1             2
sports_login_day7             2
dtype: int64
----------------------------MISSING VALUES----------------------------
userid                    0
sports_ui_group           0
sports_bets_placed_14d    0
sports_login_day1         0
sports_login_day7         0
dtype: int64
--------------------------DUPLICATED VALUES---------------------------
0
In [52]:
df.drop(columns=["userid"]).describe()
Out[52]:
sports_bets_placed_14d
count 90189.00000
mean 51.87246
std 195.05086
min 0.00000
25% 5.00000
50% 16.00000
75% 51.00000
max 49854.00000
In [53]:
df.sort_values(by="sports_bets_placed_14d", ascending=False).tail()
df.sort_values(by="sports_bets_placed_14d", ascending=False).head()
Out[53]:
userid sports_ui_group sports_bets_placed_14d sports_login_day1 sports_login_day7
41476 4588681 classic_ui 0 False False
41405 4581079 classic_ui 0 False False
63580 7041591 classic_ui 0 False False
63582 7041935 classic_ui 0 False False
63583 7042027 minimal_ui 0 False False
Out[53]:
userid sports_ui_group sports_bets_placed_14d sports_login_day1 sports_login_day7
57702 6390605 classic_ui 49854 False True
7912 871500 classic_ui 2961 True True
29417 3271615 minimal_ui 2640 True False
43671 4832608 classic_ui 2438 True True
48188 5346171 minimal_ui 2294 True True
In [54]:
df["sports_login_day1"] = df["sports_login_day1"].astype(int)
df["sports_login_day7"] = df["sports_login_day7"].astype(int)
In [55]:
df[df["sports_bets_placed_14d"] == 0].shape[0]
Out[55]:
3994
In [56]:
import matplotlib.pyplot as plt
# Calculate percentage of users with 0 bets
zero_count = len(df[df["sports_bets_placed_14d"] == 0])
total_users = len(df)
percentage = (zero_count / total_users) * 100
import matplotlib.pyplot as plt
sizes = [percentage, 100 - percentage]
labels = ["Zero Bets", "Non-Zero Bets"]
colors = ["#ff9999", "#66b3ff"]
fig, ax = plt.subplots(figsize=(6, 6), facecolor="#f0f0f0")  # background color
wedges, texts, autotexts = ax.pie(
    sizes,
    labels=labels,
    autopct='%1.1f%%',
    startangle=90,
    colors=colors,
    wedgeprops=dict(width=0.4),
    textprops=dict(color="black"),
    pctdistance=0.75  # move percentage labels into slices
)
# Add a legend outside the plot
ax.legend(wedges, labels, title="Bet Status", loc="center left", bbox_to_anchor=(1, 0.5))
ax.set_title("Users with Zero vs Non-Zero Bets", fontsize=14)
plt.tight_layout()
plt.show()
Out[56]:
<matplotlib.legend.Legend at 0x1ceed788ef0>
Out[56]:
Text(0.5, 1.0, 'Users with Zero vs Non-Zero Bets')
No description has been provided for this image
In [57]:
import plotly.express as px
import pandas as pd
# Grouping with KPIs
kpi_data = df[df["sports_bets_placed_14d"] <= 80].groupby("sports_bets_placed_14d").agg(
    player_count=("userid", "count"),
    avg_retention_day1=("sports_login_day1", "mean"),
    avg_retention_day7=("sports_login_day7", "mean")
).reset_index()
# Format retention for better display (as %)
kpi_data["Retention D1 (%)"] = (kpi_data["avg_retention_day1"] * 100).round(1)
kpi_data["Retention D7 (%)"] = (kpi_data["avg_retention_day7"] * 100).round(1)
# Tooltip with KPIs
hover_data = {
    "player_count": True,
    "Retention D1 (%)": True,
    "Retention D7 (%)": True,
    "sports_bets_placed_14d": True
}
fig = px.bar(
    kpi_data,
    x="sports_bets_placed_14d",
    y="player_count",
    title="Player Distribution by Sports Bets (0–80) with Retention KPIs",
    labels={"sports_bets_placed_14d": "Bets Placed", "player_count": "Player Count"},
    color_discrete_sequence=["#00CC96"],
    hover_data=hover_data,
    template="plotly_dark"
)
fig.update_layout(
    xaxis=dict(dtick=5),
    height=450,
    width=620,
    plot_bgcolor="#1e1e1e",
    paper_bgcolor="#1e1e1e",
    font=dict(size=14),
    margin=dict(l=40, r=40, t=50, b=50)
)
from IPython.display import HTML
HTML(fig.to_html(include_plotlyjs="cdn"))
Out[57]:
In [58]:
df.describe([0.10, 0.25, 0.80, 0.90, 0.95, 0.99])[["sports_bets_placed_14d"]].T
Out[58]:
count mean std min 10% 25% 50% 80% 90% 95% 99% max
sports_bets_placed_14d 90189.00000 51.87246 195.05086 0.00000 1.00000 5.00000 16.00000 67.00000 134.00000 221.00000 493.00000 49854.00000
In [59]:
df = df[df["sports_bets_placed_14d"] < df["sports_bets_placed_14d"].max()]
In [60]:
from statsmodels.distributions.empirical_distribution import ECDF
plt.figure(figsize=(10, 6))
for group in df["sports_ui_group"].unique():
    ecdf = ECDF(df[df["sports_ui_group"] == group]["sports_bets_placed_14d"])
    plt.plot(ecdf.x, ecdf.y, label=group)
plt.title("Cumulative Distribution of Bets by UI Group")
plt.xlabel("Bets Placed")
plt.ylabel("Cumulative Probability")
plt.legend(title="UI Group")
plt.grid(True, linestyle='--', alpha=0.6)
plt.tight_layout()
plt.show()
Out[60]:
<Figure size 1000x600 with 0 Axes>
Out[60]:
[<matplotlib.lines.Line2D at 0x1cef31cdb20>]
Out[60]:
[<matplotlib.lines.Line2D at 0x1cef31ccf20>]
Out[60]:
Text(0.5, 1.0, 'Cumulative Distribution of Bets by UI Group')
Out[60]:
Text(0.5, 0, 'Bets Placed')
Out[60]:
Text(0, 0.5, 'Cumulative Probability')
Out[60]:
<matplotlib.legend.Legend at 0x1ceed3465d0>
No description has been provided for this image
In [61]:
import seaborn as sns
import matplotlib.pyplot as plt
# Set figure
plt.figure(figsize=(10, 6))
palette = {"classic_ui": "#1f77b4", "minimal_ui": "#ff7f0e"}

sns.stripplot(
    data=df,
    x="sports_ui_group",
    y="sports_bets_placed_14d",
    hue="sports_ui_group",
    dodge=False,
    jitter=0.3,
    alpha=0.3,
    palette=palette,
    marker='o'
)

sns.pointplot(
    data=df,
    x="sports_ui_group",
    y="sports_bets_placed_14d",
    ci="sd",
    join=False,
    color="black",
    markers="D",
    errwidth=1.5
)

plt.title("Individual Bet Distributions with Group Means", fontsize=14)
plt.ylabel("Bets Placed (14 Days)", fontsize=12)
plt.xlabel("UI Group", fontsize=12)
plt.grid(True, linestyle='--', alpha=0.6)

plt.legend([],[], frameon=False)
plt.tight_layout()
plt.show()
Out[61]:
<Figure size 1000x600 with 0 Axes>
Out[61]:
<Axes: xlabel='sports_ui_group', ylabel='sports_bets_placed_14d'>
Out[61]:
<Axes: xlabel='sports_ui_group', ylabel='sports_bets_placed_14d'>
Out[61]:
Text(0.5, 1.0, 'Individual Bet Distributions with Group Means')
Out[61]:
Text(0, 0.5, 'Bets Placed (14 Days)')
Out[61]:
Text(0.5, 0, 'UI Group')
Out[61]:
<matplotlib.legend.Legend at 0x1cef4e652e0>
No description has been provided for this image
In [62]:
df.groupby("sports_ui_group").agg({"sports_login_day1": "sum", "sports_login_day7": "sum", "sports_bets_placed_14d": ("sum", "mean")})
# There doesn't seem to be a significant difference in the total retention values between the control group and the test group. 
# However, this could have occurred by chance. We need to evaluate it statistically.
Out[62]:
sports_login_day1 sports_login_day7 sports_bets_placed_14d
sum sum sum mean
sports_ui_group
classic_ui 20034 8501 2294941 51.34211
minimal_ui 20119 8279 2333530 51.29878
In [63]:
df.groupby("sports_bets_placed_14d")["userid"].count().loc[[30, 40]]
Out[63]:
sports_bets_placed_14d
30    642
40    505
Name: userid, dtype: int64
In [64]:
df.groupby("sports_ui_group").agg({"sports_bets_placed_14d": ["count", "median", "mean", "std", "min", "max"]})
Out[64]:
sports_bets_placed_14d
count median mean std min max
sports_ui_group
classic_ui 44699 17.00000 51.34211 102.05760 0 2961
minimal_ui 45489 16.00000 51.29878 103.29442 0 2640
In [65]:
# Average of players returning to the game on the first day by sports_ui_group.
df.groupby(["sports_ui_group","sports_login_day1"]).agg({ "sports_bets_placed_14d" : "mean"})
Out[65]:
sports_bets_placed_14d
sports_ui_group sports_login_day1
classic_ui 0 16.35909
1 94.41170
minimal_ui 0 16.34040
1 95.38118
In [66]:
# Average of players returning to the game in the first week by sports_ui_group.
df.groupby(["sports_ui_group","sports_login_day7"]).agg({ "sports_bets_placed_14d" : "mean"})
Out[66]:
sports_bets_placed_14d
sports_ui_group sports_login_day7
classic_ui 0 25.79651
1 160.11752
minimal_ui 0 25.85636
1 165.64984

We group users by UI type and retention (Day 1 / Day 7) to calculate
the average number of sports bets placed. This helps us interpret
engagement in relation to short-term retention.

In [67]:
import seaborn as sns
import matplotlib.pyplot as plt


mean_1 = df.groupby(["sports_ui_group", "sports_login_day1"])["sports_bets_placed_14d"].mean().reset_index()
mean_2 = df.groupby(["sports_ui_group", "sports_login_day7"])["sports_bets_placed_14d"].mean().reset_index()


sns.set(style="whitegrid")
palette = ["#1f77b4", "#ff7f0e"]

fig, axes = plt.subplots(1, 2, figsize=(12, 5), sharey=True)


sns.barplot(
    data=mean_1,
    x="sports_ui_group",
    y="sports_bets_placed_14d",
    hue="sports_login_day1",
    palette=palette,
    ax=axes[0]
)
axes[0].set_title("Day 1 Retention", fontsize=14, weight='bold')
axes[0].set_xlabel("UI Group")
axes[0].set_ylabel("Avg Bets Placed")
axes[0].legend(title="Retained?")
axes[0].grid(True, linestyle='--', alpha=0.4)


sns.barplot(
    data=mean_2,
    x="sports_ui_group",
    y="sports_bets_placed_14d",
    hue="sports_login_day7",
    palette=palette,
    ax=axes[1]
)
axes[1].set_title("Day 7 Retention", fontsize=14, weight='bold')
axes[1].set_xlabel("UI Group")
axes[1].set_ylabel("")
axes[1].legend(title="Retained?")
axes[1].grid(True, linestyle='--', alpha=0.4)

fig.suptitle("Average Bets Placed by UI Group & Retention", fontsize=16, weight='bold')
plt.tight_layout(rect=[0, 0, 1, 0.95])
plt.show()
Out[67]:
<Axes: xlabel='sports_ui_group', ylabel='sports_bets_placed_14d'>
Out[67]:
Text(0.5, 1.0, 'Day 1 Retention')
Out[67]:
Text(0.5, 0, 'UI Group')
Out[67]:
Text(0, 0.5, 'Avg Bets Placed')
Out[67]:
<matplotlib.legend.Legend at 0x1ceed8e12e0>
Out[67]:
<Axes: xlabel='sports_ui_group', ylabel='sports_bets_placed_14d'>
Out[67]:
Text(0.5, 1.0, 'Day 7 Retention')
Out[67]:
Text(0.5, 0, 'UI Group')
Out[67]:
Text(0, 0.5, '')
Out[67]:
<matplotlib.legend.Legend at 0x1ceed9787a0>
Out[67]:
Text(0.5, 0.98, 'Average Bets Placed by UI Group & Retention')
No description has been provided for this image

Hypothesis¶

  • H0 (Null): There is no difference between the two UI groups.
  • H1 (Alternative): There is a statistically significant difference in user behavior between groups.

We'll reject the null hypothesis if p-value < 0.05.

In [68]:
from scipy import stats
from scipy.stats import shapiro, mannwhitneyu
from IPython.display import Markdown
import pandas as pd

# A/B Test function with return
def AB_Testing(dataframe, group_col, metric_col, control_val="classic_ui", test_val="minimal_ui"):
    groupC = dataframe[dataframe[group_col] == control_val][metric_col]
    groupT = dataframe[dataframe[group_col] == test_val][metric_col]

    # Normality checks
    ntC = shapiro(groupC)[1] < 0.05
    ntT = shapiro(groupT)[1] < 0.05

    if not ntC and not ntT:
        levene_p = stats.levene(groupC, groupT)[1]
        leveneTest = levene_p < 0.05
        ttest = stats.ttest_ind(groupC, groupT, equal_var=not leveneTest)[1]
    else:
        ttest = mannwhitneyu(groupC, groupT)[1]

    ab_hypothesis = ttest < 0.05
    test_type = "Parametric" if not ntC and not ntT else "Non-Parametric"
    homogeneity = "Yes" if not ntC and not ntT and not leveneTest else "No"
    comment = "A/B groups are not similar!" if ab_hypothesis else "A/B groups are similar!"

    result = {
        "Test Type": [test_type],
        "Homogeneity": [homogeneity],
        "AB Hypothesis": ["Reject H0" if ab_hypothesis else "Fail to Reject H0"],
        "p-value": [round(ttest, 5)],
        "Comment": [comment]
    }

    return pd.DataFrame(result)

# Display result 
def show_hypothesis_result(result_df):
    res = result_df.iloc[0]
    text = f"""
### 🎯 A/B Testing Summary
- **Test Type:** `{res['Test Type']}`
- **Equal Variance:** `{res['Homogeneity']}`
- **p-value:** `{res['p-value']}`
- **Result:** `{res['AB Hypothesis']}`
- **💬 Comment:** {res['Comment']}`
"""
    display(Markdown(text))

# Run the test and display
result = AB_Testing(df, "sports_ui_group", "sports_bets_placed_14d")
show_hypothesis_result(result)

🎯 A/B Testing Summary¶

  • Test Type: Non-Parametric
  • Equal Variance: No
  • p-value: 0.05089
  • Result: Fail to Reject H0
  • 💬 Comment: A/B groups are similar!`

Conclusion & Business Insight¶

The test has completed using appropriate statistical methods. Based on the observed behavior in sports betting activity and retention, we now have statistically backed insights to support product decisions.

Strategic Criticism:¶

While statistical significance is observed, it's essential to validate:

  • Long-term retention and LTV impacts
  • Behavior across different customer segments
  • Impact beyond 14 days and on real monetary conversions

Always complement A/B test results with qualitative feedback and business KPIs before rolling out changes globally.